-
Notifications
You must be signed in to change notification settings - Fork 274
Fix handling of DEAD instructions and function call inlining #6473
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## develop #6473 +/- ##
========================================
Coverage 76.02% 76.02%
========================================
Files 1546 1546
Lines 165352 165359 +7
========================================
+ Hits 125711 125717 +6
- Misses 39641 39642 +1
Continue to review full report at Codecov.
|
a4b8ca5
to
c4dc070
Compare
When handling DEAD instructions in function/loop assigns clause instrumentation, the instruction iterator was not being incremented correctly. This led to instrumentation outside of the function/loop scope, and spurious write set inclusion violations. Moreover, for loops, declarations of some temporaries (those involved in the "initialization" of loop counter, for instance) are "outside" of the loop identified by CBMC, so we no longer raise an exception on not finding a corresponding DECL for each DEAD. These variables are writable because they appear as assigns clause targets, not because they were local DECLs.
Function call inlining was earlier performed on the same function multiple times if it had multiple loops. We refactor the inlining call out and only inline the function once, even if it has multiple loops.
Previously, only loop body was being instrumented for write set inclusion checks. This could miss checking writes performed by the loop guard, if it has side effects. In this PR, we also instrument the loop guard with inclusion checks.
c4dc070
to
abd36cf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing some tests with actual side effects in the guard evaluation, and unresolved issue with decreases clause evaluation
// FIXME: This simple approach wouldn't work when | ||
// the loop guard in the source file is split across multiple lines. | ||
const auto head_loc = loop_head->source_location(); | ||
while(loop_head->source_location() == head_loc) | ||
while(loop_head->source_location() == head_loc || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems fragile. An alternative solution could be to have the C front end inject LOCATION markers to precisely delimit loop entry, loop guard evaluation, loop condition test, loop body, loop step instructions, jump back to head instructions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, @martin-cs suggested something similar (keeping some markers around), in my other PR.
I would suggest not to fix this this in the same PR especially because it would touch the C front end (parsing functions etc.) This PR is tiny and it only fixes the assigns clause issues (with DEAD
) and makes it play well with function call inlining (modulo the existing issue with loop body finding).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I agree with both. I have already had Opinions at @SaswatPadhi about loop structure detection (@remi-delmas-3000 did I CC you? If not I can resend. ); this is fragile and is asking for trouble and inconsistency with the rest of CPROVER. I think this functionality should be factored out and used in all appropriate places.
But I also agree with @SaswatPadhi ; that is outside of the scope of this PR.
|
||
void main() | ||
{ | ||
for(int i = lowerbound(); i < upperbound(); incr(&i)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we also add another simple test such as
int main()
{
unsigned int max;
assume(max > 0);
unsigned int i=0;
while( i < max++)
loop_decreases(max-i)
{
i++;
}
}
This loop's decreases clause is incorrect, max and i can both overflow
(but the loop still terminates when max overflows and becomes smaller than i or stays at max_int and i starts catching up on it)
We should be able to find all these errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that you should be able to find these and agreed that this would be worth adding as a test.
Not entirely sure how this connects to the specific issue being fixed though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a separate issue from the one addressed in this PR. @remi-delmas-3000 is there an issue where we track this?
if(!natural_loops.loop_map.size()) | ||
return; | ||
|
||
goto_function_inline( | ||
goto_functions, function_name, ns, log.get_message_handler()); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
inlining after loop detection will miss loops that are hidden behind function calls that do not have contracts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which is okay? because they don't have contracts anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't object but do note that this kind of inlining can get expensive.
log.warning() << "Found a `DEAD` variable " | ||
<< name2string(symbol.get_identifier()) | ||
<< " without corresponding `DECL`, at: " | ||
<< instruction_it->source_location() << messaget::eom; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to warn about constructs the user has no control over ? How could we distinguish such expected cases from really incorrect cases ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could probably use log.debug
or log.info
, yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah... I am not sure it is "logging at run-time to the user" issue, in part because, as @remi-delmas-3000 says, it is not clear what the user is supposed to do with this information.
This feels like a DATA_INVARIANT
of a goto-programs; you can only DEAD
a variable that you have previously DECL
d. This seems reasonable and feels like the kind of thing that should be in the --validate-goto-model
checks.
Thanks for taking a look, @remi-delmas-3000
Guards with side effects were out of scope until now. This PR only changes assigns clause instrumentation to happen on the entire loop that CBMC identifies, rather than just the body. We haven't tested loops with guards that have side effects though, and there could be other issues with the invariant assumption / havoc statement / base case assertion placement, that we might want to fix before we can get decreases clauses working. I would suggest keeping this PR only about assigns clause fixes (both function & loop contracts) and making separate PRs each for fixing issues with loops that with guards that have side effects, handling |
approving to merge but we need to keep in mind the issues mentioned in my review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not going to block this over the logging but please could you address it?
( Also thanks for the really well split-out changes and clear commit messages, makes it so much easier to read. )
|
||
void main() | ||
{ | ||
for(int i = lowerbound(); i < upperbound(); incr(&i)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that you should be able to find these and agreed that this would be worth adding as a test.
Not entirely sure how this connects to the specific issue being fixed though.
log.warning() << "Found a `DEAD` variable " | ||
<< name2string(symbol.get_identifier()) | ||
<< " without corresponding `DECL`, at: " | ||
<< instruction_it->source_location() << messaget::eom; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah... I am not sure it is "logging at run-time to the user" issue, in part because, as @remi-delmas-3000 says, it is not clear what the user is supposed to do with this information.
This feels like a DATA_INVARIANT
of a goto-programs; you can only DEAD
a variable that you have previously DECL
d. This seems reasonable and feels like the kind of thing that should be in the --validate-goto-model
checks.
// FIXME: This simple approach wouldn't work when | ||
// the loop guard in the source file is split across multiple lines. | ||
const auto head_loc = loop_head->source_location(); | ||
while(loop_head->source_location() == head_loc) | ||
while(loop_head->source_location() == head_loc || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I agree with both. I have already had Opinions at @SaswatPadhi about loop structure detection (@remi-delmas-3000 did I CC you? If not I can resend. ); this is fragile and is asking for trouble and inconsistency with the rest of CPROVER. I think this functionality should be factored out and used in all appropriate places.
But I also agree with @SaswatPadhi ; that is outside of the scope of this PR.
if(!natural_loops.loop_map.size()) | ||
return; | ||
|
||
goto_function_inline( | ||
goto_functions, function_name, ns, log.get_message_handler()); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't object but do note that this kind of inlining can get expensive.
|
||
void main() | ||
{ | ||
for(int i = lowerbound(); i < upperbound(); incr(&i)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a separate issue from the one addressed in this PR. @remi-delmas-3000 is there an issue where we track this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was too late to the party: I'll create a follow-up PR to fix the issues I called out.
// they must appear as assigns targets anyway, | ||
// but their DECL statements are outside of the loop. | ||
log.warning() << "Found a `DEAD` variable " | ||
<< name2string(symbol.get_identifier()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name2string
seems to be rarely used (to the extent that I'm wondering whether we should get rid of it). I'd suggest you use id2string
instead, though here you actually don't need either!
return; | ||
|
||
goto_function_inline( | ||
goto_functions, function_name, ns, log.get_message_handler()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline: calls to goto_function_inline
will require goto_functions.update()
to be called afterwards.
// Insert aliasing assertions | ||
// Inline all function calls. | ||
goto_function_inline( | ||
goto_functions, function_obj->first, ns, log.get_message_handler()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above: call goto_functions.update()
In this PR, we fix 3 minor issues related to code contracts:
DEAD
instructions during write set inclusion checking.The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/My commit message includes data points confirming performance improvements (if claimed).White-space or formatting changes outside the feature-related changed lines are in commits of their own.